Sustainable Carbon-Aware and Water-Efficient LLM Scheduling in Geo-Distributed Cloud Datacenters
Moore, Hayden, Qi, Sirui, Hogade, Ninad, Milojicic, Dejan, Bash, Cullen, Pasricha, Sudeep
In recent years, Large Language Models (LLMs) such as ChatGPT, CoPilot, and Gemini have been widely adopted in a variety of areas. As the use of LLMs continues to grow, many efforts have focused on reducing the massive training overheads of these models. But it is the environmental impact of handling user requests to LLMs that is increasingly becoming a concern. Recent studies estimate that the costs of operating LLMs in their inference phase can exceed training costs by 25x per year. As LLMs are queried incessantly, the cumulative carbon footprint of the operational phase has been shown to far exceed the footprint of the training phase. Further, estimates indicate that 500 ml of fresh water is expended for every 20-50 requests to LLMs during inference. To address these important sustainability issues with LLMs, we propose a novel framework called SLIT to co-optimize LLM quality of service (time-to-first-token), carbon emissions, water usage, and energy costs. The framework utilizes a machine learning (ML) based metaheuristic to enhance the sustainability of LLM hosting across geo-distributed cloud datacenters. Such a framework will become increasingly vital as LLMs proliferate.
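The abstract names four objectives that SLIT trades off when placing LLM requests across geo-distributed datacenters. As a minimal sketch of that idea, the snippet below scores candidate datacenters with a simple weighted sum over time-to-first-token, carbon intensity, water usage, and energy price; the datacenter names, metric values, and weights are illustrative assumptions, and a weighted-sum score stands in for the paper's ML-based metaheuristic, which is not detailed here.

```python
from dataclasses import dataclass

@dataclass
class Datacenter:
    name: str
    ttft_s: float            # expected time-to-first-token for this request (s)
    carbon_g_per_kwh: float  # grid carbon intensity (gCO2/kWh), assumed value
    water_l_per_kwh: float   # water consumed per kWh (L/kWh), assumed value
    price_per_kwh: float     # energy cost ($/kWh), assumed value

def score(dc: Datacenter, energy_kwh: float, w=(0.4, 0.3, 0.2, 0.1)) -> float:
    """Weighted-sum cost of serving one request at dc; lower is better.

    The weights are illustrative, not from the paper.
    """
    return (w[0] * dc.ttft_s
            + w[1] * dc.carbon_g_per_kwh * energy_kwh
            + w[2] * dc.water_l_per_kwh * energy_kwh
            + w[3] * dc.price_per_kwh * energy_kwh)

def place_request(datacenters, energy_kwh=0.002):
    """Route the request to the datacenter with the lowest combined cost."""
    return min(datacenters, key=lambda dc: score(dc, energy_kwh))

# Hypothetical candidate sites: one fast but carbon-heavy, one slower but cleaner.
dcs = [
    Datacenter("us-colorado", ttft_s=0.8, carbon_g_per_kwh=550,
               water_l_per_kwh=1.8, price_per_kwh=0.09),
    Datacenter("eu-west", ttft_s=1.2, carbon_g_per_kwh=220,
               water_l_per_kwh=1.1, price_per_kwh=0.14),
]
best = place_request(dcs)
```

With these made-up numbers, the cleaner European site wins despite its higher latency, illustrating how the weighting lets operators trade quality of service against sustainability.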
Bringing AI to the Edge
This year, U.S. rail carrier Amtrak will be installing two novel inspection gateways from Duos Technologies along its busy Northeast Corridor. The barn-like Duos structures straddle railway tracks; as passenger trains speed through at up to 125 miles per hour, 97 cameras and dozens of LED lights arrayed around the sides, top, and bottom of the tracks will capture thousands of high-resolution images of the railcars. These images are aggregated and processed on site in real time to present a complete, 360-degree, highly detailed view of the train. Artificial intelligence (AI) algorithms running on Nvidia GPUs will analyze the images locally; if the model flags a potential structural or mechanical flaw, train personnel will be notified in less than a minute. The Duos portal is one of many new examples of what is loosely categorized as edge AI, or the deployment and operation of AI models outside of massive cloud datacenters.
Radium looks to speed up AI and ML jobs in cloud datacenters
Today, Radium, a startup that aims to use artificial intelligence and machine learning to extract more computing power from cloud hardware, announced it was leaving stealth mode and deploying its solutions to cloud datacenters run by Cyxtera in Toronto, the New York and New Jersey metro area, and Silicon Valley. The main product, called Launchpad, lets users start and shut down projects on bare metal machines, eliminating the extra layers of hypervisors and virtualization software. Radium offered benchmark tests on machine learning jobs that showed speed increases ranging from 30% to 140%. "Our initial testing shows that bare metal servers offer a good cloud computing platform for the high-performance deep learning and inference workloads required for these types of applications," said Srinivasa Narasimhan, a professor at Carnegie Mellon's School of Computer Science, who has been working with the company to test its product. Many cloud products rely heavily on virtualization software layers, or "hypervisors," that allow one physical machine to simulate a variety of smaller machines that appear independent to users.
Cloud datacenters anticipated to become largely robot-dependent by 2025
In a strong endorsement of the value of artificial intelligence and machine learning, research firm Gartner predicts that half of cloud datacenters will be leveraging advanced robots by 2025. Gartner believes these AI-centered deployments will improve datacenter operating efficiency by 30%. So what role will robots play in cloud datacenters? Why are these robots considered so vital, and what motivates businesses to adopt them at such a robust pace? The typical workflow in a cloud datacenter comprises a host of different actions.